CS395T Data Mining Project Report: One-Class SVM Formulations for Multiple Instance Learning
Abstract
Multiple Instance Learning (MIL) considers a particular form of weak supervision in which the learner is given a set of positive bags and a set of negative bags. A positive bag is a set of instances containing at least one positive example; a negative bag is a set of instances all of which are negative. A number of binary SVM-based solutions to this problem have been proposed, such as the Normalized Set Kernel (NSK) of Gärtner et al., 2002 ([1]), which represents a bag as the sum of all its instances normalized by its 1- or 2-norm, and the sparse MIL (sMIL) technique of Bunescu and Mooney, 2007 ([2]), which improves upon NSK by considering a weaker balancing constraint. In this project I plan to look at equivalent formulations for a one-class SVM and empirically evaluate whether ignoring the negative bags in the formulation is detrimental to the solution found.
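The bag representation described for NSK can be illustrated with a minimal sketch. It assumes a linear instance kernel, so that the bag embedding can be written out explicitly as the sum of the instance vectors divided by the 1- or 2-norm of that sum. The function names `bag_embedding` and `nsk_linear` and the toy bags are hypothetical and are not taken from the report.

```python
import math

def bag_embedding(bag, norm="l2"):
    """Sum the instance vectors of a bag, then divide by the 1- or 2-norm of the sum.

    Assumption: a linear instance kernel, so the feature map is the identity.
    """
    dim = len(bag[0])
    total = [sum(inst[d] for inst in bag) for d in range(dim)]
    if norm == "l1":
        scale = sum(abs(v) for v in total)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in total))
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    if scale == 0.0:
        return total  # all-zero bag: leave it unnormalized
    return [v / scale for v in total]

def nsk_linear(bag_a, bag_b, norm="l2"):
    """Bag-level kernel value: dot product of the two normalized bag embeddings."""
    ea = bag_embedding(bag_a, norm)
    eb = bag_embedding(bag_b, norm)
    return sum(x * y for x, y in zip(ea, eb))

# Toy positive bags (hypothetical data): each bag is a list of instance vectors.
positive_bags = [
    [[1.0, 0.0], [3.0, 4.0]],
    [[2.0, 2.0]],
]

# Gram matrix over positive bags only, as a one-class learner would see it.
gram = [[nsk_linear(a, b) for b in positive_bags] for a in positive_bags]
```

A Gram matrix of this kind, built from positive bags alone, is what a one-class SVM that accepts a precomputed kernel would consume; whether that matches the formulation studied in the project is not stated in the abstract.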
Similar resources
IRDDS: Instance reduction based on Distance-based decision surface
In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...
One-Class Multiple Instance Learning and Applications to Target Tracking
Existing work in the field of Multiple Instance Learning (MIL) has only looked at the standard two-class problem, assuming both positive and negative bags are available. In this work, we propose the first analysis of the one-class version of the MIL problem, where one is only provided input data in the form of positive bags. We also propose an SVM-based formulation to solve this problem setting. To ...
Support Vector Machines for Multiple-Instance Learning
This paper presents two new formulations of multiple-instance learning as a maximum margin problem. The proposed extensions of the Support Vector Machine (SVM) learning approach lead to mixed integer quadratic programs that can be solved heuristically. Our generalization of SVMs makes a state-of-the-art classification technique, including non-linear classification via kernels, available to an a...
An Effective Combination Based on Class-Wise Expertise of Diverse Classifiers for Predictive Toxicology Data Mining
This paper presents a study on the combination of different classifiers for toxicity prediction. Two combination operators for the Multiple-Classifier System definition are also proposed. The classification methods used to generate classifiers for combination are chosen in terms of their representability and diversity and include the Instance-based Learning algorithm (IBL), Decision Tree learni...
Machine Learning in Wireless Relay Channels
Our course project for CS395T has made substantial progress since the project proposal was submitted. The first phase, which consists of implementing the communication protocols and algorithms, is nearly complete, and work on the second phase, which consists of implementing and running the classifiers, is about to begin. This report details the progress we have made, the challenges we have face...